Computing Optimal Repairs for Functional Dependencies

Authors

  • Ester Livshits
  • Benny Kimelfeld
  • Sudeepa Roy
Abstract

We investigate the complexity of computing an optimal repair of an inconsistent database, in the case where integrity constraints are Functional Dependencies (FDs). We focus on two types of repairs: an optimal subset repair (optimal S-repair) that is obtained by a minimum number of tuple deletions, and an optimal update repair (optimal U-repair) that is obtained by a minimum number of value (cell) updates. For computing an optimal S-repair, we present a polynomial-time algorithm that succeeds on certain sets of FDs and fails on others. We prove the following about the algorithm. When it succeeds, it can also incorporate weighted tuples and duplicate tuples. When it fails, the problem is NP-hard, and in fact, APX-complete (hence, cannot be approximated better than some constant). Thus, we establish a dichotomy in the complexity of computing an optimal S-repair. We present general analysis techniques for the complexity of computing an optimal U-repair, some based on the dichotomy for S-repairs. We also draw a connection to a past dichotomy in the complexity of finding a "most probable database" that satisfies a set of FDs with a single attribute on the left-hand side; the case of general FDs was left open, and we show how our dichotomy provides the missing generalization and thereby settles the open problem.
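To make the notion of an optimal S-repair concrete, the following is a minimal brute-force sketch. It is not the polynomial-time algorithm from the paper; it simply tries deletion sets of increasing size until the remaining tuples satisfy all FDs, so it runs in exponential time and is only suitable for tiny examples. The function names, the representation of an FD as a pair `(lhs, rhs)` of attribute tuples, and the sample relation are all assumptions made for illustration.

```python
from itertools import combinations

def violates(t1, t2, fds):
    """Return True if tuples t1 and t2 jointly violate some FD lhs -> rhs."""
    for lhs, rhs in fds:
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[b] != t2[b] for b in rhs)
        if agree_on_lhs and differ_on_rhs:
            return True
    return False

def is_consistent(rows, fds):
    """A set of tuples satisfies the FDs iff no pair of tuples violates one."""
    return not any(violates(a, b, fds) for a, b in combinations(rows, 2))

def optimal_s_repair_bruteforce(rows, fds):
    """Illustrative only: find a consistent subset with the fewest deletions.

    Tries every deletion set of size 0, 1, 2, ... and returns the first
    consistent remainder. The empty relation is always consistent, so the
    search terminates at the latest when all tuples are deleted.
    """
    n = len(rows)
    for k in range(n + 1):
        for deleted in combinations(range(n), k):
            kept = [r for i, r in enumerate(rows) if i not in deleted]
            if is_consistent(kept, fds):
                return kept

# Hypothetical relation with the single FD emp -> dept.
rows = [
    {"emp": "ann", "dept": "hr", "city": "paris"},
    {"emp": "ann", "dept": "it", "city": "paris"},
    {"emp": "bob", "dept": "it", "city": "rome"},
    {"emp": "ann", "dept": "hr", "city": "oslo"},
]
fds = [(("emp",), ("dept",))]
repair = optimal_s_repair_bruteforce(rows, fds)
```

In this sample, the second tuple conflicts with both the first and the fourth, so deleting it alone restores consistency. The abstract's dichotomy concerns when such a minimum-deletion subset can be found in polynomial time rather than by exhaustive search like this.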

Similar resources

Sampling the Repairs of Functional Dependency Violations under Hard Constraints

Violations of functional dependencies (FDs) are common in practice, often arising in the context of data integration or Web data extraction. Resolving these violations is known to be challenging for a variety of reasons, one of them being the exponential number of possible “repairs”. Previous work has tackled this problem either by producing a single repair that is (nearly) optimal with respect...

Detecting Ambiguity in Prioritized Database Repairing

In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a “minimal way.” Often, repairs are not equally legitimate, as it is desired to prefer one over another; for example, one fact is regarded more reliable than another, or a more recent fact should be preferred to an earlier one. Motivated by these considerations,...

Unambiguous Prioritized Repairing of Databases

In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a “minimal way.” Often, repairs are not equally legitimate, as it is desired to prefer one over another; for example, one fact is regarded more reliable than another, or a more recent fact should be preferred to an earlier one. Motivated by t...

Pattern-Driven Data Cleaning

Data is inherently dirty and there has been a sustained effort to come up with different approaches to clean it. A large class of data repair algorithms rely on data-quality rules and integrity constraints to detect and repair the data. A well-studied class of integrity constraints is Functional Dependencies (FDs, for short) that specify dependencies among attributes in a relation. In this pape...

Problem Decomposition Method to Compute an Optimal Cover for a Set of Functional Dependencies

The paper proposes a problem decomposition method for building optimal cover for a set of functional dependencies to decrease the solving time. At the beginning, the paper includes an overview of the covers of functional dependencies. There are considered definitions and properties of non redundant covers for sets of functional dependencies, reduced and canonical covers as well as equivalence c...

Journal:
  • CoRR

Volume: abs/1712.07705

Publication date: 2017